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ROBUST  MULTIGRID  ALGORITHMS  FOR  THE  INCOMPRESSIBLE  NAVIER-STOKES 

EQUATIONS* 

RUBEN  S.  MONTEROt  AND  IGNACIO  M.  LLORENTE* 

Abstract.  Anisotropies  occur  naturally  in  CFD  where  the  simulation  of  small  scale  physical  phenomena, 
such  as  boundary  layers  at  high  Reynolds  numbers,  causes  the  grid  to  be  highly  stretched  leading  to  a  slow 
down  in  convergence  of  multigrid  methods.  Several  approaches  aimed  at  making  multigrid  a  robust  solver 
have  been  proposed  and  analyzed  in  literature  using  the  scalar  diffusion  equation.  However,  they  have 
been  rarely  applied  to  solving  more  complicated  models,  like  the  incompressible  Navier-Stokes  equations. 
This  paper  contains  the  first  published  numerical  results  of  the  behavior  of  two  popular  robust  multigrid 
approaches  (alternating-plane  smoothers  combined  with  standard  coarsening  and  plane  implicit  smoothers 
combined  with  semi-coarsening)  for  solving  the  3-D  incompressible  Navier-Stokes  equations  in  the  simulation 
of  the  driven  cavity  and  a  boundary  layer  over  a  flat  plate  on  a  stretched  grid.  The  discrete  operator  is 
obtained  using  a  staggered-grid  arrangement  of  variables  with  a  finite  volume  technique  and  second-order 
accuracy  is  achieved  using  defect  correction  within  the  multigrid  cycle.  Grid  size,  grid  stretching  and 
Reynolds  number  are  the  factors  considered  in  evaluating  the  robustness  of  the  multigrid  methods.  Both 
approaches  yield  large  increases  in  convergence  rates  over  cell-implicit  smoothers  on  stretched  grids.  The 
combination  of  plane  implicit  smoothers  and  semi-coarsening  was  found  to  be  fully  robust  in  the  flat  plate 
simulation  up  to  Reynolds  numbers  106  and  the  best  alternative  in  the  driven  cavity  simulation  for  Reynolds 
numbers  above  103.  The  alternating-plane  approach  exhibits  a  better  behavior  for  lower  Reynolds  numbers 
(below  to  103)  in  the  driven  cavity  simulation.  A  parallel  variant  of  the  smoother,  tri-plane  ordering,  presents 
a  good  trade-off  between  convergence  and  parallel  properties. 

Key  words,  plane  implicit  smoothers,  symmetric  coupled  Gauss-Seidel,  robust  multigrid,  defect  correc¬ 
tion,  Navier-Stokes 

Subject  classification.  Applied  and  Numerical  Mathematics 

1.  Introduction.  Multigrid  techniques  are  generally  accepted  as  fast  and  efficient  methods  for  solving 
many  types  of  partial  differential  equations,  and  particularly  elliptic  problems  whose  discretization  results 
in  a  K-matrix  [24].  For  this  kind  of  problems,  basic  point- wise  iterative  methods,  like  Gauss-Seidel  or 
damped  Jacobi,  are  good  smoothers,  and  multigrid  methods  exhibit  an  optimal  complexity  (work  is  linearly 
proportional  to  the  number  of  unknowns),  optimal  memory  requirements,  and  good  parallel  efficiency  and 
scalability  in  parallel  implementations  [11]. 

However  the  efficiency  of  the  multigrid  methods  degenerates  dramatically  in  presence  of  anisotropies. 
It  is  well  known  that  in  the  resolution  of  the  Poisson  equation  the  convergence  factor  of  the  multigrid 
method  tends  to  one  as  the  anisotropies  are  increased  [1].  Typically  these  anisotropies  might  occur  when 
the  coefficients  of  the  discrete  operator  vary  throughout  the  domain  or  when  stretched  grids  are  used. 
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This  anisotropic  condition  occurs  naturally  in  the  field  of  Computational  Fluid  Dynamics  (CFD)  where  the 
simulation  of  small  scale  physical  phenomena,  such  as  boundary  layers  at  high  Reynolds  numbers,  causes 
the  grid  to  be  highly  stretched  leading  to  a  slow  down  in  convergence. 

In  some  situations,  when  the  source  of  the  anisotropy  is  known  beforehand,  a  block  implicit  smoother 
can  be  used  to  improve  the  efficiency  of  the  multigrid  algorithm.  Usually  this  is  done  by  applying  a  implicit 
solver  in  the  directions  of  strong  coupling,  as  states  Brandt’s  fundamental  block  relaxation  rule  [1].  This 
technique  is  common  practice  in  CFD.  Thomas,  Diskin  and  Brandt  [20]  have  demonstrated  the  efficiency 
of  the  distributive  smoothing  scheme  with  line  solvers  applied  to  high  Reynolds  number  simulations  when 
the  grid  stretching  is  normal  to  the  body.  The  benefits  of  plane  relaxation  are  shown  by  Oosterlee  in  [17] 
for  simulations  of  the  3-D  incompressible  Navier-Stokes  equations  over  grids  with  non-unitary  aspect  ratios. 
Also  a  combination  of  line  implicit  techniques  and  semi-coarsening  has  been  successfully  used  by  Mavriplis 
in  [4]  and  [5]  to  solve  high  Reynolds  number  2-D  and  3-D  viscous  flows  over  anisotropic  unstructured  meshes. 

However,  in  a  general  situation  the  nature  of  the  anisotropy  is  not  known  beforehand,  so  there  is  no 
way  of  knowing  which  of  the  variables  are  coupled.  Moreover,  if  the  problem  is  solved  on  a  stretched 
grid  or  the  equation  coefficients  differ  from  each  other  throughout  the  domain  (computational  and  physical 
anisotropy  respectively)  the  values  of  the  coefficients  and  their  relative  magnitudes  vary  for  different  parts  of 
the  computational  domain.  In  such  cases  the  multigrid  techniques  based  on  point-  or  plane-wise  smoothers 
combined  with  full  coarsening  fail  to  smooth  error  components  with  the  consequent  deterioration  of  the 
multigrid  convergence  factor. 

Several  approaches  aimed  at  making  multigrid  a  robust  solver  have  been  proposed  in  literature.  One 
popular  approach  is  to  use  standard  coarsening  combined  with  an  alternating-direction  implicit  smoother 
[19,  10,  12,  19].  This  solution  consists  in  exploring  all  the  possibilities  in  order  to  develop  a  robust  smoother, 
i.e.  use  alternating-line  relaxation  in  2-D  and  alternating-plane  relaxation  in  3-D.  Another  approach  to 
dealing  with  anisotropic  problems  is  to  combine  an  implicit  smoother  with  an  appropriate  semi-coarsening 
procedure  [6,  18].  This  is  rather  popular  in  literature  and  overcomes  some  parallelization  problems  that  can 
be  found  in  the  alternating-plane  smoothers  [13].  For  example,  a  simple  way  to  avoid  using  an  alternating- 
plane  smoother  is  to  use  semi-coarsening  in  one  direction  and  relaxation  in  a  fixed  plane  (e.g.  combine 
xy-plane  relaxation  with  Z  semi-coarsening).  Other  intermediate  alternatives  that  combine  plane,  line  or 
point  relaxations  with  partial  and  full  coarsening  have  also  been  presented  in  multigrid  literature  [14,  16]. 

Some  of  these  robust  multigrid  approaches  have  also  been  tested  for  the  efficient  resolution  of  the  2- 
D  Navier-Stokes  equations.  The  alternating-direction  line  smoother  has  been  investigated  for  the  solution 
of  the  incompressible  2-D  Navier-Stokes  equations  in  [21,  15].  However,  to  the  authors’  knowledge,  the 
robust  multigrid  algorithms  have  never  been  applied  to  the  resolution  of  the  3-D  incompressible  Navier- 
Stokes  equations.  The  aim  of  this  work  is  to  present  a  thorough  study  of  the  application  of  two  common 
robust  multigrid  algorithms  (alternating-plane  smoothers  combined  with  standard  coarsening  and  plane 
smoothers  combined  with  semi-coarsening)  to  the  resolution  of  the  3-D  Navier-Stokes  equations  on  single¬ 
block  structured  grids. 

The  robustness  of  a  smoother  is  defined  as  its  ability  to  efficiently  solve  a  wide  range  of  problems.  In 
this  sense  the  definition  of  robustness  is  qualitative  and  has  to  be  defined  more  precisely  by  setting  up  a  set 
of  suitable  test  problems.  Traditionally,  the  above  mentioned  approaches  have  shown  to  be  robust  smoothers 
for  the  anisotropic  diffusion  equation.  In  the  present  context  we  will  characterize  the  multigrid  algorithms  as 
robust  if  the  solution  of  the  governing  system  of  equations  can  be  attained  in  a  fixed  amount  of  work  units 
(time  to  compute  the  system  metrics  in  the  finest  level)  independent  of  the  grid  size,  grid  stretching  factor 
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and  Reynolds  number.  This  is  equivalent  to  saying  that  the  convergence  factor  of  the  multigrid  algorithm 
is  independent  of  the  grid  size,  stretching  and  Reynolds  number.  We  will  also  refer  to  this  property  as 
textbook  multigrid  convergence  (TMC).  We  believe  that  the  achievement  of  a  textbook  multigrid  convergence 
rate  through  increasing  the  work  and  memory  requirements  per  cycle  is  the  first  step  to  achieving  textbook 
multigrid  efficiency  (TME).  Opposed  to  TMC  the  TME,  defined  by  Brandt  in  [1],  fixes  the  computational 
work  to  solve  the  problem  to  ten  or  less  work  units. 

This  paper  is  organized  as  follows:  The  numerical  scheme  used  by  our  simulations  is  described  in  Section 
2.  Details  of  the  implementation  of  the  multigrid  algorithm  used  in  this  work  will  be  presented  in  Section  3. 
Numerical  results  are  obtained  in  Section  4  for  two  common  benchmarks  in  CFD;  the  driven  cavity  and  the 
flow  over  a  flat  plate.  In  this  section,  the  robustness  of  the  alternating-plane  smoothers  combined  with  full 
coarsening  and  plane  smoothers  combined  with  semi-coarsening  will  be  investigated.  The  paper  ends  with 
some  conclusions  in  Section  5. 


2.  The  Primitive  Equations.  The  dimensionless  steady-state  incompressible  Navier-Stokes  equations 
in  the  absence  of  body  forces  may  be  written  as: 


(2.1) 


(u-  V)u=  -Vp-b 


Re 


An, 


V  •  u  =  0, 


where  u  €  5ft3  =  (u,v,  w)  is  the  non  dimensional  velocity  field  and  p  is  the  dimensionless  pressure.  Re  is  the 
Reynolds  number  defined  as  Re  =  where  U0 0  is  a  characteristic  velocity,  L  a  characteristic  length  and 

v  the  kinematic  viscosity. 

2.1.  Discretization.  In  order  to  obtain  the  discrete  expression  of  the  non-linear  system  (2.1),  the 
solution  domain  is  divided  into  a  finite  set  of  control-volumes  (CV).  In  the  present  work  we  will  use  an 
orthogonally  structured  grid  where  each  control  volume  will  be  an  hexahedron,  as  in  left-hand  chart  in  figure 
2.1.  The  variables  are  stored  in  a  staggered  way,  i.e.  the  velocities  are  evaluated  in  the  faces  of  the  CV  and 
the  pressure  field  at  the  center  of  each  CV.  Staggered  discretization  has  the  benefits  of  stability  properties 
and  leads  to  a  natural  discrete  form  of  the  continuity  equation  [7,  9]. 


z 


FIG.  2.1.  Placement  of  the  unknowns  in  the  CV  (left-hand  chart).  Control  Volume  where  the  u-momentum  equation  is 
integrated  (right-hand  chart). 


The  procedure  carried  out  to  discretize  the  u  momentum  equation 


will  now  be  described  with  some  detail. 
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In  a  staggered  arrangement  of  unknowns  each  equation  is  integrated  in  its  own  CV.  The  u-momentum  CV 
is  built  surrounding  the  uijk  variable,  displaced  from  the  CV  of  the  continuity  equation,  as  in  right-hand 
chart  in  figure  2.1.  In  the  following,  we  will  refer  to  the  dimensions  of  this  CV  Q  as  AX,  AY,  AZ.  So  we 
can  write  the  u-momentum  equation  for  a  generic  node  Uijk  in  its  integral  form  as  : 


uVu  dV 


-l 


v-dv+-kL 


A  u  dV. 


The  convective  term  in  the  momentum  equation  using  the  Gauss  theorem  is  rewritten  as: 


[  u VudV  =  [  ii(u  ■  n)  dS  =  V  / 

J  Q  J  i,  d  dC 


e,w,s,n,t,b. 


The  last  integral  in  (2.3)  is  easily  approximated  applying  the  midpoint  rule.  Providing  the  value  of  the 
function  in  the  middle  of  the  face  results  in  the  leading  truncation  term  being  <D{h2).  Thus,  to  preserve  this 
accuracy  the  interpolation  of  the  fluxes  at  the  CV  faces  has  to  be  at  least  of  second-order.  This  is  assured  by 
using  a  parabolic  interpolation  for  the  velocities  and  linear  interpolation  for  the  mass  fluxes.  Moreover,  for 
non-uniform  grids  the  fluxes  are  not  computed  at  the  middle  of  the  CV.  So,  assuming  a  stretched  geometric 
grid  of  the  form  hk+ 1  =  0hk  with  0  as  the  grid  stretching  factor,  the  integral  approximation  will  have  a 
truncation  error  O((0  -  l)/i)  +  <D(h?).  Taking  these  considerations  into  account  integral  (2.3)  is  written  as: 

(2.4)  V)/  u(u-n)dS «  \]m,kuk,  k  =  e,w,s,n,t,b; 

T  JdQ*  k 

where  the  mass  fluxes  mk  have  been  defined  as  fdQ,  u  ■  n  dS  and  can  be  evaluated  with  the  following 


expressions: 


me  =  Uiik  ±p^AYAZ, 


mw  =  Uiik  +^i+ljk-  AY  AZ, 


1VijkAXi  Wi—ijkAXi^-\ 


AY,  mt  - 


VijkAXi  +  Vi— ljkA.Xi+1  .  Vij-\-lk  Axt'  4*  Vi—ij+ik  A^i+l  A  ry 

ms  -  — - - — - - A Z,  mn  = - - - 

-f-  AX{+i  A/.  Wijk+1  Axj  +  ljfc+1  A^i+1  A  v 

mb  =  — - - — - AT,  mt  - : - ^ ’ 

with  Axi+1  =  xi+ljk  -  Xijk  and  Ax*  =  xijk  -  x{- ljk.  The  velocity  at  the  CV  face  is  interpolated  by  fitting 
a  parabola  to  the  values  of  the  velocity  at  three  consecutive  nodes:  the  two  nodes  located  on  either  side 
of  the  surface  of  interest,  plus  the  adjacent  node  in  the  upstream  direction.  In  this  work  we  will  use  the 
QUICK  formulation  of  Hayase  et  al.  [8],  that  can  be  seen  as  a  defect-correction  scheme  based  on  the  upwind 
difference  approximation  : 


f  uijk+S+{  u-n)e>0,  ^  (  uijk+S+  (u.n)„  >  0,  ^  I  uijk  +  S+  (u  •  n)„  >  0, 

Ue~\  ui+ijk  +  S~  (u  •  n)e  <0,  \  Ui-ii*  +  S~  (u  •  n)w  <0,  \  uij+lk  +  S~  (u  •  n)n  <  0 

_  f  Uijk  +  5+  (u  -  n)s  >0,  u  —  \  U^k  ^  ub  —  I  Ui^k  (u  ‘  n)6  > 

8  |  Uij- ik  +  S~  (u  •  n)s  <0,  [  Uijk+i  +  St  (u  •  n)t  <0,  \  +  Sb  (u  •  n)b  <  0, 

The  defect-correction  source  terms  S+  and  S~  are  calculated  within  the  multigrid  cycle  using  the  current 
approximation  whenever  a  discrete  evaluation  of  the  residual  is  needed.  So  the  algebraic  coefficients  for  the 
convection  terms  can  be  written  as: 

Lce  =  min(0,me),  Lcw  =  mm{0,mw),  Lcn  —  min(0,mn), 

Lcs  —  min(0, m5),  L\  =  min(0 ,mt),  Lcb  =  min(0,m&), 

(2.5)  Li  =  -(Tg  +  Lcn  +  L\  +  Lcb  +  L\  +  Lcw ). 


'U'ijk  d-  ^  fij 

'U'ij+lk  d”  (**  ‘  **)«  ^ 
Uijk  d“  (u  *  11)5  >  0, 
'Uijk—  l  +  Sb  (u  ’  Xl)b  < 
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The  expression  for  Lcp  has  been  obtained  using  the  continuity  equation  over  the  CV  H,  which  in  its 


discrete  form  is: 


ne  4  m,w  4-  mn  4  rn,s  4  mt  4-  m&  =  0. 


Using  the  Gauss  theorem  in  the  diffusive  part  of  the  momentum  equation  (2.3)  and  the  midpoint  rule 
to  approximate  the  resulting  surface  integral  we  get: 


(2-6lVU  n-~LV^e  \dxJJ~*  '  l \dy)n  \dyJJ-~ »  '  L \dzjt  \dzjb j  — 

The  derivatives  in  the  above  expression  are  evaluated  with  a  central  difference  scheme  : 

L„  _  AFAZ  Ld  _  AT  A Z  £d  _  AXA Z 

e  Re(xi+ijk  -  Xijk)  ’  w  Re{xijk  -  Xi-ijk)’  n  Re(yij+ik  -  Vij-ik)  ’ 

Ld  _  AXAZ  Ld  _  AT  AX  AFAX 

*  Re(yij+2k  ~  Vijk)  R&(Zijk+ 1  ~  Zijk—l)  Re{Zijk+2  ~  zijk) 

(2.7)  l2  =  -(lS  +  l£  +  l?  +  l?  +  l?  +  l£). 

Finally  treating  the  pressure  as  a  surface  force  the  volume  integral  in  (2.2)  can  be  expressed  as  a 
surface  integral,  as  in  (2.8).  Again  this  is  evaluated  using  the  midpoint  rule  approximation,  in  this  case  no 
interpolation  is  needed  for  the  pressure  due  to  the  staggered  arrangements  of  unknowns  as  can  be  seen  in 
right-hand  chart  in  figure  2.1. 


A  Sx  + 


■  f  pi  ■  ndS  «  (pw  -pe)  ASe  i  =  (1,0,0). 
Jon 


Now,  we  can  write  the  algebraic  equation  for  a  generic  velocity  node  Uijk  as: 

(2.9)  L^Ui-ijk  4-  L™Ui+ijk  4*  L^Uij-ik  4  L™Uijk+ 1  4  L^uijk-i  4  LpUijk  4  L^pijk  4  L^pi- ijk  =  F{jk. 

The  coefficients  multiplying  the  velocity  u  are  obtained  as  the  sum  of  the  diffusive  and  convective  part,i.e. 
Lf  =  L\  4  Lf  with  l  =  e,w,n,6,M  arid  tll0se  multiplying  the  pressure  are  obtained  directly  from  (2.8). 
An  Equivalent  expression  may  be  obtained  for  the  v  and  w  momentum  equation  and  can  be  derived  by 
symmetry  from  the  above  equations. 

The  continuity  equation  can  be  easily  approximated  due  to  the  fact  that  all  velocities  are  known  within 
the  surface  of  the  volume. 

(2.10)  [  V  •  udV  «  (ue  -  uw)AY AZ  4  (vn  —  vs)AXAZ  4  (wt  -  wb)AXAY. 

Jn 

The  above  expressions  are  valid  for  CV’s  inside  the  domain  and  must  be  modified  in  order  to  satisfy  the 
boundary  conditions.  The  discretization  of  the  boundary  conditions  are  performed  by  mirroring  the  cells 
adjacent  to  the  boundary.  The  new  variables  outside  the  solution  domain  are  extrapolated  invoking  the 
boundary  condition  at  each  boundary.  With  these  modifications  of  the  algebraic  equations  the  system  of 
non-linear  equations  to  be  solved  can  be  presented  in  a  matrix  form  as: 

/z£  0  0  z£  \  /  II  \  f  fu\ 

(o  id  0  L'i  0  L*  v  U 

[iU)  0  0  L*  L*  w  fw  ’ 

Lhm  Lhm  Ll  0  ;  V  P  /  U  / 

where  the  source  terms  /u,  /v,  fw  and  fp  in  the  right-hand  side  of  the  system  (2.11)  include  the  discretization 
of  the  boundary  conditions  and  the  contribution  of  the  QUICK  scheme. 
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3.  The  Multigrid  Method.  A  sequence  of  grids  Cl1  (l  =  1, ...,  M)  is  used  in  the  full  multigrid  (FMG) 
scheme  [1]  where  Q1  is  the  finest  target  grid  and  the  rest  of  the  grids  are  obtained  by  applying  cell-centered 
coarsening.  The  computations  are  initiated  in  the  coarsest  grid,  once  the  discrete  system  is  solved  the 
solution  is  transferred  to  the  next  finer  level.  The  prolongated  solution  is  then  used  as  an  initial  guess  for 
the  multigrid  method  in  that  level,  this  procedure  is  repeated  until  the  finest  grid  is  reached.  The  goal  of 
this  algorithm  is  to  reduce  the  algebraic  error  to  below  the  discretization  error  in  just  one  FMG  cycle. 

Due  to  the  non-linearity  of  the  problem  a  Full  Approximation  Scheme  (FAS)  [1]  is  used  to  solve  each  level 
in  the  FMG  cycle.  The  following  iterative  algorithm  represents  a  FAS  V(7i, 72) -cycle  to  solve  the  nonlinear 
system  Lu  =  /  on  fi1  where  71  and  72  represent  the  number  of  pre-smoothing  and  post-smoothing  iterations 
respectively: 

step  1  ( Pre-smoothing  ):  Apply  71  iterations  of  the  smoothing  method  to  Llul  =  fl 

FOR  1=1  TO  L  Restriction  Part 

step  2:  Compute  the  residual  rl~l  —  fl~l  -  Ll~lul~l 

step  3:  Restriction  of  the  residual  rl  =  R\_xrl~l 

step  4:  Restriction  of  the  solution  ulold  =  I^u1"1 

step  5:  Compute  the  metrics  of  level  I  Ll(ulold) 

step  6:  Calculate  the  new  right-hand  side  fl  =  rl  4-  Llulold 

step  7  (Pre-smoothing  ):  Apply  71  iterations  of  the  smoothing  method  to  Llul  =  fl 
FOR  l=L-l  TO  1  Prolongation  Part 

step  8:  Correction  of  the  current  approximation  ul  -  ul  +  P/+1(u*+1  -  u^1) 
step  9:  Compute  the  metrics  of  level  1  Ll(ul) 

step  10  ( Post-smoothing  ):  Apply  72  iterations  of  the  smoothing  method  to  Llul  =  fl 

In  steps  5  and  9  the  metrics  of  the  system  are  computed  over  the  current  grid,  which  includes  the 
computation  of  the  correction  terms  from  the  QUICK  scheme  and  also  the  linearization  of  the  system  based 
on  the  actual  solution.  Note  that  the  metrics  of  the  system  are  also  updated  within  the  smoothing  process 
in  steps  7  and  10  as  explained  subsequently  in  Section  3.1.  The  operators  R\_i  and  P/+1  in  steps  3  and  8 
are  used  to  transfer  data  (solution  and  residuals)  between  two  different  grids;  from  the  coarser  level  to  the 
current  (prolongation)  or  from  the  finer  to  the  current  level  (restriction),  respectively. 

These  transfer  operators  are  dictated  by  the  staggered  arrangement  of  unknowns  and  the  coarsening 
procedure  used.  The  prolongation  and  restriction  operators  are  volume-weighted  trilinear  interpolation  in 
the  case  of  standard  coarsening.  For  the  semi-coarsening  approach  the  velocity  component  parallel  to  the 
coarsened  direction  is  restricted  using  injection,  the  other  variables  are  restricted  using  volume-weighted 
linear  interpolation  in  lines  parallel  to  the  coarsened  direction.  The  prolongation  operator  in  this  case  is 
volume-weighted  linear  interpolation.  Note  that  for  semi-coarsening  the  velocity  component  parallel  to  the 
coarsened  direction  is  treated  in  a  vertex- centered  way,  while  the  rest  of  the  variables  are  transferred  as 
cell-centered. 

In  the  following  experiments  F-cycles  will  be  used  (see  figure  3.1)  to  solve  each  level  of  the  FMG 
algorithm.  F-cycles  have  been  reported  to  be  more  efficient  for  rotating  problems  [15]  at  the  expense  of  their 
parallel  properties  [4].  The  coarsest  level  is  fixed  as  coarse  as  possible,  and  it  will  be  solved  with  5  iterations 
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of  the  smoothing  process. 


Fig.  3.1.  Scheme  of  an  F-cycle  F( '^1,^2)  where  7O  represents  the  number  of  iterations  of  the  smoother  performed  to  solve 
the  coarsest  level 

3.1.  Smoothing  Process.  One  of  the  most  important  parts  of  a  multigrid  algorithm  is  the  smoothing 
process.  Several  smoothers  for  the  Navier-Stokes  equations  problem  were  studied  in  literature.  These  ap¬ 
proaches  fall  into  two  categories:  (1)  coupled  smoothing  [22,  17,  21]  (where  the  momentum  and  continuity 
equation  are  satisfied  simultaneously),  and  (2)  distributive  smoothing  [20,  3]  (where  the  momentum  equa¬ 
tions  are  solved  in  a  first  step,  and  then  the  velocities  and  pressures  are  corrected  in  order  to  satisfy  the 
continuity  equation).  In  situations  where  the  coefficients  vary  through  the  control  volume  (e.g.  stretched 
grids,  strong  recirculating  flows,...)  coupled  smoothing  has  advantages  over  the  distributive  approach  be¬ 
cause  the  linearized  momentum  and  continuity  equations  are  solved  simultaneously  [22,  9].  However  the 
computational  cost  of  the  coupled  method  is  much  higher  than  that  of  the  distributive.  Note  that  a  (small) 
matrix  has  to  be  inverted  in  each  CV.  Moreover,  every  velocity  component  is  updated  essentially  twice  since 
it  updates  all  the  variables  involved  in  a  CV  simultaneously  (see  right-hand  chart  in  figure  2.1). 

In  particular,  we  have  chosen  a  cell-implicit  Symmetric  Coupled  Gauss  Seidel  (SCGS)  method  as  the 
base  smoother  because  of  its  higher  stability  and  rapid  convergence.  This  smoother  was  introduced  by  Vanka 
[22]  and  previously  considered  in  other  work  [21].  Considering  the  CV  ijk ,  the  momentum  equations  for  the 
six  cell  faces  together  with  the  continuity  equation  for  the  control- volume  can  be  expressed  as: 

L™+m,j+n,k+pUi+mi3+n>k+P  Lp™Pijk  +  Lp^Pi-ijk  =  fijki 

|m|+|n|+ |p|<l 

L™+m,j+n,k+pUi+™J+n>k+P  ^  LpiPi+ljh  Lp'^pijk  ~  fi+ljki 

\m\+\n\  +  \p\<l 

E  IJi+m,j+n,k+pVi+rnJj+n,k+p  +  Lp*Pijk  +  I^^Pij-lk  =  fijk’ 

|m|+|n|+|p|<l 

E  +  ^p^Pij+1*  +  ^p^-yPijk  —  /ij+lfcj 

|m|  +  |n|  +  |p|<l 

E  L™+mj+ntk+pWi+rn,j+n,k+p  +  LpkPijh  +  Lp^^ijk-1  ~  fijki 
|m|+|n|-f  |p|<l 

E  ^!+m)i+n,Jc+pU,*+m'i+n>fc+P  +  LplPijk+l  +  Lvl_xpijk  —  fijk+ 1j 
|m|+in|+|p!<l 

(ui+ijk  —  Uijk)&Y A Z  +  (vij+ik  —  Vijk)AX AZ  4-  (wijk+i  —  Wijk)AX AY  =  f™k. 

(3.1) 

This  set  of  equations  for  the  CV  is  linearized  by  computing  the  mass  fluxes,  Lu'v'w,  with  the  current 
values  of  the  velocity  field.  Defining  the  residuals  ru'v’w  and  the  corrections  A u  =  wn+1  -un,  etc.  the  system 
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(3.1)  can  be  arranged  in  a  block  structure  as  follows: 


TUW 

Lijk 

0 

0 

0 

0 

0 

Lp?  \ 

(  Auijk  \ 

/  TU 

'  rijk 
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r  ue 
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0 

0 

0 

^H'i+ljk 

ru 

0 

0 

r  vs 
■k ijk 
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0 

0 

Ln 

Avijk 

rv 

Tijk 

0 

0 

0 

rvn 

Ifc 

0 

0 

Avij+ik 
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rv 

1  ij+lk 

0 

0 

0 

0 

T  wb 

L ijk 

0 

A  wijk 

rw 

rijk 

0 

0 

0 

0 

0 

rwt 
Ljijk+ 1 

Lvt-i 

T 

•Sc 

<1 

rw 

1  ijk+ 1 

-AY  AZ 

AY  AZ 

-AXA  Z 

AXAZ 

-AXAY 

A  A  AY 

0  1 

\  A pijk  / 

V  r?jk 

/ 

(3.2) 

A  more  implicit  version  of  the  system  (3.2)  that  includes  off-diagonal  elements  in  the  first  six  rows  is 
also  possible,  this  is  equivalent  to  considering  implicitly  in  the  equations  (3.1)  all  the  unknowns  involved  in 
the  control-volume.  However  the  convergence  factor  is  similar  and  the  system  is  more  expensive  to  solve 
than  the  system  of  equations  (3.2)  [21].  The  system  (3.2)  is  easily  solved  by  Gaussian  elimination  and  then 
the  velocity  components  and  the  pressure  of  the  CV  are  updated  using  under-relaxation: 

un+1  =  un  4-  cauAu, 

(3.3)  Pn+1  -pn  +  uvAp. 

The  under-relaxation  technique  has  the  effect  of  adding  a  pseudo-time  dependent  term  in  the  equations. 
In  the  following  simulations  the  under-relaxation  factor  for  the  pressure,  u)p ,  has  been  fixed  to  1.0,  while  the 
under-relaxation  factor  for  the  velocities,  vu,  is  strongly  problem  dependent  and  has  to  be  set  empirically. 
The  optimum  value  of  uu  is  a  function  of  the  Reynolds  number,  the  grid  size  and  also  depends  on  whether  the 
convection  scheme  is  first  (upwind)  or  second  order  (QUICK)  accurate.  This  is  a  drawback  of  this  smoother, 
since  a  simulation  has  to  be  tuned  in  order  to  find  out  the  best  under-relaxation  factor.  As  figure  3.2  shows 
the  efficiency  of  the  method  can  be  dramatically  worsened  with  a  bad  choice  of  uiu. 


Fig.  3.2.  Number  of  fine  grid  cycles  required  to  converge  a  driven  cavity  simulation  with  Re— 500  on  a  16x16x16  uniform 
grid  as  a  function  of  the  relaxation  parameter  cau  • 

3.2.  Plane  Implicit  Smoothers.  Implicit  solvers  have  been  widely  considered  in  previous  work  as  a 
cure  to  eliminating  all  the  high-frequency  errors  in  the  presence  of  strong  anisotropies.  Taking  advantage  of 
the  relatively  small  1-D  problem  size,  these  implicit  line  smoothers  are  based  on  an  exact  solver.  However 
the  3-D  counterpart  does  not  present  this  possibility,  since  the  2-D  problem  size  is  no  longer  small  enough 
to  consider  using  an  exact  solver.  Furthermore  a  direct  exact  solver  for  the  planes  is  not  needed,  as  has 
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been  shown  in  [10]  for  the  3-D  Poisson  equation  and  in  [17]  for  the  incompressible  Navier-Stokes  equations. 
This  consideration  drastically  reduces  the  computational  cost  of  the  overall  algorithm  compared  to  that  of 
an  “exact”  plane  solver.  However  this  inexact  solution  of  the  planes  does  not  decrease  the  convergence  of 
the  multigrid  algorithm  [17,  10],  i.e.  solving  the  plane  beyond  a  precision  threshold  does  not  improve  the 
convergence  rate.  Note  that  the  plane  implicit  smoother  has  to  damp  high  oscillating  error  components  in 
the  plane  rather  than  solving  a  2-D  problem  exactly. 

In  the  present  work,  the  planes  will  be  approximately  solved  with  a  2-D  multigrid  algorithm  consisting 
of  one  FAS  F(l,l)  cycle.  The  same  kind  of  anisotropies  found  in  the  3-D  problem  may  appear  in  the  2-D 
system.  Thus  a  robust  multigrid  algorithm  is,  again,  completely  necessary.  For  the  2-D  system  the  same 
robust  algorithms  will  be  considered,  i.e.  an  alternating-line  smoother  combined  with  full  coarsening  and  a 
line-implicit  smoother  combined  with  semi-coarsening.  One  1-D  FAS  F(l,l)  cycle  will  be  applied  to  solve 
the  lines,  the  smoother  in  this  case  being  SCGS  as  described  in  the  previous  section. 

The  coupled  philosophy  of  the  SCGS  will  be  applied  in  the  line  and  plane  solvers.  The  plane  smoother 
relaxes  simultaneously  the  momentum  and  continuity  equations  of  the  cells  included  in  the  plane,  and  hence 
all  velocity  components  and  pressures  contained  within  the  plane  will  be  updated  at  the  same  time.  Let  us 
consider  for  example  an  xy  plane,  defining  the  vector  Xk  that  accommodates  the  variables  for  a  whole  plane 
of  cells: 

Xk  =  (u,  v,  w,  w+,p),u  =  Uijk,  v  =  VijkiW  =  Wijk,  w+  =  Wijk+ i,P  =  Pijk,  Vi,  j  6  [0,n]  k  =  const . 

The  equation  system  for  the  plane  in  terms  of  residuals  and  corrections  is: 

(3.4)  Lk  AXk  =  Rk} 

where  Rk  =  fk  -  LkXk  is  the  residual  of  the  kth  plane  and  AXk  =  X£+1  -  X£  is  the  increment  of  the 
solution.  The  system  of  equations  (3.4)  is  built  into  the  smoothing  process  as  follows.  When  solving  the  kth 
plane,  the  metrics  in  that  plane  are  linearized  using  the  current  solution.  Also  the  second-order  correction  for 
the  convective  term  is  recomputed.  With  these  new  metrics  the  residual  Rk  for  the  kih  plane  is  calculated. 

Defining  a  specific  ordering  of  the  planes,  many  types  of  plane  smoothers  can  be  easily  constructed.  It 
is  important  to  note  that  with  the  second  order  operator,  the  right-hand  side  of  the  system  (3.4)  depends 
on  the  values  of  the  plane  k,k±  1  and/or  k  ±  2  depending  on  the  direction  of  the  velocity.  Thus  a  parallel 
implementation  can  not  be  constructed  based  on  a  regular  zebra  ordering  (left-hand  chart  in  figure  3.3).  In 
order  to  avoid  these  dependencies  a  tri-plane  smoother  could  be  applied  [15]  (right-hand  chart  in  figure  3.3) 

Grid  stretching  is  commonly  used  in  grid  generation  to  pack  points  into  regions  with  large  solution 
gradients  while  avoiding  an  excess  of  points  in  more  benign  regions  (for  example  in  the  simulation  of  viscous 
flows  at  high  Reynolds  number  to  resolve  boundary  layers).  The  convergence  of  multigrid  based  on  point 
smoothing  and  full  coarsening  deteriorates  dramatically  when  highly  stretched  grids  are  used.  In  some 
situations,  when  the  direction  of  the  anisotropies  is  known  beforehand,  the  multigrid  convergence  can  be 
improved  using  an  implicit  smoother  in  the  direction  normal  to  the  stretching.  However,  if  the  stretched 
grid  generates  aspect  ratios  whose  relative  magnitudes  vary  for  different  parts  of  the  computational  domain 
the  multigrid  techniques  based  on  plane- wise  smoothers  combined  with  full  coarsening  fail  to  smooth  error 
components.  Other  remedies  should  be  used  to  achieve  a  robust  solver,  the  two  most  common  alternatives 
being: 

•  Robust  multigrid  smoothing  process  with  standard  coarsening.  If  the  coarser  grids  are  built  by  dou¬ 
bling  the  mesh  size  in  all  coordinates  direction,  sweeps  of  the  planes  in  the  three  directions  are 


9 


O  First  Sweep 
X  Second  Sweep 


□  Second  Sweep 
X  Third  Sweep 


Fig.  3.3.  Standard  zebra  ordering  of  planes  (left-hand  chart).  Planes  relaxed  concurrently  in  a  tri-plane  smoother  (Right- 
hand  chart). 


needed  to  achieve  robustness  ((y,z)-plane  smoothing  sweep  (x,z)-plane  smoothing  sweep  -¥  (x,y)- 
plane  smoothing  sweep).  From  here  on,  this  approach  will  be  referred  as  alternating-plane  smoothers 
(APS).  Several  versions  of  this  method  can  be  developed  depending  on  the  sweep  ordering:  sym¬ 
metric  alternating-plane  smoother  (S-APS),  lexicographic  alternating-plane  smoother  (L-APS)  and 
tri-plane  alternating-plane  smoother  (Tri-APS). 

•  Plane  implicit  smoothers  combined  with  semi- coarsening.  Instead  of  using  standard  coarsening  the 
coarse  levels  can  be  built  by  only  coarsening  along  one  direction.  In  order  to  achieve  robustness  a 
plane  implicit  solver  perpendicular  to  the  coarsened  direction  is  needed.  Based  on  the  coarsened  di¬ 
rection,  we  will  refer  to  these  approaches  as  X,  Y  or  Z  semi-coarsening  (XSC,  YSC,  ZSC).  Depending 
on  the  order  in  which  the  planes  are  swept  we  can  construct  the  following  methods:  symmetric  Z 
semi-coarsening  (S-ZSC),  lexicographic  Z  semi-coarsening(L-ZSC)  and  tri-plane  Z  semi-coarsening 
(Tri-ZSC). 

4.  Numerical  Experiments.  Two  different  flows  have  been  chosen  to  test  the  robustness  of  the 
multigrid  algorithms  described  previously:  the  driven  cavity  and  the  flow  over  a  flat  plate.  These  two  cases 
have  been  widely  studied  and  used  as  benchmarking  problems  for  CFD  codes.  Although  the  flow  structure 
are  relatively  simple,  they  exhibit  some  basic  problems  that  prevent  optimal  multigrid  efficiencies  from  being 
achieved  [2],  namely  strong  recirculating  flows  and  boundary  layers. 

Let  R  be  the  1,2-norm  of  the  average  residual  of  the  system  of  equations  (2.11)  defined  as: 


(4.1) 


R  = 


E(  Wa  +  CR?jfc)2  +  (-R^)2  +  (gW!) 

4 -Nx-Nu-Nz 


where  Ru,  Rv,  Rw,  Rc  are  the  residuals  of  the  u,  v,  w  momentum  equations  and  continuity  equation 
respectively.  The  convergence  criterion  is  based  on  R.  When  the  fine  grid  average  residual  decreases  to 
below  10-4  the  calculations  are  terminated.  This  value  is  small  enough  to  assure  that  the  algebraic  error 
is  below  to  the  discretization  error.  Let  R0  and  Rn  denote  respectively  the  residual  norms  (as  defined  in 
4.1)  before  the  iterative  process  and  after  the  convergence  criterion  is  satisfied.  So  the  average  convergence 
factor  is  defined  by: 

(4.2)  *=<£>*• 

4.1.  Flow  in  a  Driven  Cavity.  The  numerical  solution,  which  has  been  widely  used  for  testing 
numerical  schemes,  is  that  of  a  flow  confined  in  a  rectangular  domain  with  the  upper  wall  moving  at  a 
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constant  speed.  The  flow  structure  for  low  to  moderate  Reynolds  numbers  consists  of  a  3-D  primary  vortex 
and  two  3-D  secondary  vortexes  at  the  bottom.  The  problem  currently  considered  consists  of  a  cube  of 
dimension  L  with  the  top  wall  moving  at  a  velocity  u.  The  Reynolds  number  of  the  flow  based  in  these 
quantities  is  Re  =  The  boundary  conditions  are  of  Dirichlet  type  for  the  velocities  on  the  six  faces  of 
the  computational  domain,  and  no  boundary  conditions  were  necessary  for  the  pressure. 


Fig.  4.1.  Grid  used  for  the  driven- cavity  problem  32x32x32  (left-hand  chart),  and  structure  of  the  3D  primary  vortex  for 
Re= 103  (right-hand  chart). 

Simulations  have  been  performed  over  three  different  grids,  each  one  uniform  and  stretched  :  16x16x16, 
32x32x32  and  64x64x64.  The  stretched  grids  were  of  the  form  hk+1  =  fihk,  the  stretching  factor  being  (3 
equal  to  1.1  in  all  cases  (see  figure  4.1  left-hand  chart).  The  driven  cavity  problem  is  a  rotating  flow  for  which 
standard  multigrid  schemes  might  have  difficulties  to  converge.  These  difficulties  were  not  experienced  in 
this  work  since  a  moderate  Reynolds  number  was  considered.  However,  the  simulations  result  in  a  complex 
re-circulating  flow  consisting  of  3-D  vortex  structures  as  can  be  seen  in  figure  4.1.  The  flow  field  is  in  good 
qualitative  agreement  with  previous  flow  calculations  [23],  The  profiles  at  the  center  line  of  the  cavity  with 
Re=103  on  a  32x32x32  stretched  grid  can  be  seen  in  figure  4.2. 


Fig.  4.2.  The  u  velocity  component  profile  at  the  center  line  of  the  cavity  with  Re= 103  for  first-order  and  second-order 
accurate  solutions  over  a  32x32x32  streched  grid. 

As  mentioned  in  section  3.1  the  convergence  factor  varies  with  the  choice  of  the  under-relaxation  factor 
for  the  velocity  field.  In  table  4.1  the  under-relaxation  factors  used  in  each  simulation,  as  a  function  of  the 
Reynolds  number  and  grid  size,  are  shown  for  the  different  multigrid  cycle  under  study:  symmetric  SCGS, 
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tri-plane  and  symmetric  APS,  and  tri-plane  and  symmetric  ZSC. 

Table  4.1 

Under-relaxation  factors  for  the  driven-cavity  simulation  as  a  function  of  the  grid  size,  Reynolds  number  and  multigrid 
cycle. 


Grid 

Reynolds  Number 

16x16x16 

32x32x32 

64x64x64 

SCGS 

APS 

ZSC 

SCGS 

APS 

ZSC 

SCGS 

APS 

ZSC 

102 

0.6 

0.6 

0.5 

0.5 

0.6 

0.6 

0.5 

0.6 

0.5 

103 

0.4 

0.3 

0.4 

0.3 

0.3 

0.3 

0.3 

0.3 

0.3 

Figure  4.3  shows  the  L2-norm  of  the  residual  versus  F(l,l)-cycles  with  a  symmetric  SCGS  smoother  for 
several  uniform  grids  and  Reynolds  numbers.  The  behavior  of  this  smoother  is  quite  good  for  low  Reynolds 
numbers  (left-hand  graph  in  figure  4.3  for  Reynolds  number  102),  the  residual  norm  is  reduced  by  between 
four  and  five  orders  of  magnitude  in  the  first  five  cycles.  However  its  efficiency  decreases  as  the  problem 
becomes  more  convective.  The  residual  norm  can  not  be  reduced  by  four  orders  of  magnitude  in  ten  cycles 
(right-hand  graph  in  figure  4.3  for  Reynolds  number  103).  Furthermore,  convergence  could  not  be  attained 
over  stretched  grids  with  a  cell- wise  smoother  like  symmetric  SCGS. 


FlG.  4.3.  Li-norm  of  the  residual  versus  F(l,l)-cycles  with  symmetric  SCGS  smoother  for  several  uniform  grids  and 
Reynolds  numbers  for  the  driven-cavity  simulation. 

Figure  4.4  shows  the  L2-norm  of  the  residual  versus  F(l,l)-cycles  with  an  alternating-plane  smoother 
combined  with  full  coarsening  (APS)  for  several  grids  and  Reynolds  numbers.  The  symmetric  APS  approach 
(top  graphs  in  figure  4,4)  converges  the  residual  to  below  10-4  in  five  cycles  for  both  Reynolds  numbers 
(102  and  103).  However,  the  tri-plane  APS  approach  (bottom  graphs  in  figure  4.4)  need  eight  cycles  to 
reduce  the  residual  norm  to  below  10-4  for  Reynolds  number  103.  It  is  interesting  to  note  that  the  cost 
per  multigrid  cycle  with  the  symmetric  ordering  is  about  twice  as  large  as  that  with  the  tri-plane  ordering. 
However  convergence  factor  per  work  unit  is  better  with  the  symmetric  ordering  of  planes  for  low  Reynolds 
numbers  and  it  is  similar  for  both  smoothers  for  Re  =  103. 

One  of  the  drawbacks  of  the  APS  approach  is  its  difficult  implementation  in  a  parallel  setting  [13].  This 
problem  can  be  easily  overcome  using  a  plane  smoother  combined  with  semi-coarsening  to  ensure  robustness. 
The  block  implicit  smoother  used  to  converge  the  driven  cavity  simulation  needs  to  be  applied  along  the 
sub-characteristics  of  the  discrete  operator  for  convection  dominated  problems  in  order  to  obtain  the  higher 
efficiency.  For  example,  it  was  observed  that  the  xz-plane  sweeps  seriously  harm  the  smoothing,  and  so  the 
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Fig.  4.4.  L2-norm  of  the  residual  versus  F(l,l)-cycles  with  an  alternating-plane  smoother  combined  with  full  coarsening 
(APS)  for  several  grids  and  Reynolds  numbers  with  a  symmetric  order  of  planes  (top  graphs)  and  a  tri-plane  ordering  (low 
graphs)  for  the  driven  cavity  simulation. 


Y  semi-coarsening  approach  exhibits  a  poor  behavior. 

Figure  4.5  shows  the  L2-norm  of  the  residual  versus  F(l,l)-cycles  with  an  xy-plane  implicit  smoother 
combined  with  Z  semi-coarsening  (ZSC)  for  several  grids  and  Reynolds  numbers.  Although  not  shown,  the 
behavior  exhibited  by  the  XSC  approach  is  similar  to  the  one  presented  in  the  set  of  figures  4.5  for  ZSC.  The 
symmetric  ZSC  approach  (top  graphs  in  figure  4.5)  converges  the  residuals  to  below  10“4  in  five  cycles  for 
both  Reynolds  numbers  (102  and  103).  However,  the  tri-plane  ZSC  approach  (bottom-charts  in  figure  4.5) 
is  not  able  to  reduce  the  residual  norm  below  10“4  in  ten  cycles  for  Reynolds  number  103.  As  one  might 
expect,  the  time  per  cycle  is  twice  as  fast  for  the  tri-plane  ordering,  although  the  convergence  factor  per 
work  unit  is  better  with  the  symmetric  ordering  for  all  the  cases. 

Table  4.2  shows  the  average  convergence  factors  obtained  in  the  simulation  of  the  driven  cavity  problem 
for  several  uniform  and  stretched  grids,  Reynolds  numbers  and  F(l,l)  cycles.  The  convergence  factor  has  been 
proved  to  be  independent  of  the  grid  size  and  stretching  for  the  two  robust  approaches  investigated,  however 
the  convergence  is  not  Reynolds  number  independent  for  the  driven  cavity  simulation.  The  algorithms 
exhibit  the  same  behavior  for  low  Reynolds  numbers  as  when  solving  the  Poisson  equation,  i.e.  the  residual 
reduction  per  cycle  is  similar  in  both  situations  [10].  The  convergence  factor  improves  on  stretched  grids  (as 
in  the  fully  elliptic  case  [10]),  and  as  shown  in  table  4.2,  it  also  improves  for  finer  grids.  The  convergence 
factor  for  the  APS  approach  is  lower  than  for  the  SC  approach  and  its  cost  per  cycle  is  twice  as  low  because 
the  F-cycle  spends  a  lot  of  time  on  coarser  levels.  However  its  difficult  and  low-efficiency  parallelization  and 
its  difficulty  to  converge  for  Reynolds  numbers  higher  than  103  might  make  the  semi-coarsening  approach 
more  attractive. 
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Fig.  4.5.  L'l-norm  of  the  average  residual  versus  F(l,l)-cycles  with  an  xy-plane  implicit  smoother  combined  with  z  semi¬ 
coarsening  (ZSC)  for  several  grids  and  Reynolds  numbers  with  a  symmetric  order  of  planes  (top  graphs)  and  a  tri-plane 
ordering  ( low  graphs)  for  the  driven  cavity  simulation. 


Table  4.2 

Average  convergence  factors  obtained  in  the  simulation  of  the  driven  cavity  problem  for  several  uniform  (U)  and  stretched 
(S)  grids,  Reynolds  numbers  and  different  F(l,l)  cycles. 


Grid 

32x32x32 

64x64x64 

S-APS 

s-zsc 

Tri-APS 

Tri-ZSC 

S-APS 

s-zsc 

Tri-APS 

Tri-ZSC 

Re  =  102  U 

1.6  10~3 

0.07 

0.14 

0.24 

3  1CT3 

0.04 

0.17 

0.21 

Re  =  102  S 

i.4  icr3 

0.04 

0.08 

0.23 

7  10"3 

0.04 

0.07 

0.20 

Re  =  103  U 

0.17 

0.21 

0.25 

0.5 

0.10 

0.18 

0.34 

0.5 

Re  =  10s  S 

0.15 

0.21 

0.3 

0.46 

0.10 

0.15 

0.34 

0.46 

4.2.  3D  Flat  Plate  Boundary  Layer.  We  consider  an  square  plate  placed  in  the  middle  of  the 
solution  domain.  In  the  west  face  {x  =  0)  we  define  the  inflow  boundary  with  no  angle  of  attack,  and  so 
the  east  face  will  hold  the  outflow  condition.  On  the  plate  a  no-slip  boundary  condition  is  imposed,  and 
symmetric  condition  is  imposed  elsewhere  on  the  domain  boundary.  As  the  velocity  gradient  normal  to  the 
wall  is  very  high  only  in  the  boundary  layer,  the  thin-layer  approximation  which  only  retains  those  terms 
can  be  adopted.  However  in  the  following  simulations  the  original  form  (2.1)  of  the  Navier-Stokes  equations 
is  solved. 

In  order  to  capture  the  viscous  effects,  the  grid  is  highly  stretched  near  the  plate  (see  left-hand  chart 
in  figure  4.6).  Moreover,  the  grid  is  refined  near  the  plate  edges  to  reduce  the  large  discretization  errors 
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in  those  zones  as  advocated  by  Thomas  et  al.  [20].  To  ensure  that  a  sufficient  number  of  grid  points  will 
lie  inside  the  boundary-layer,  the  mesh  space  for  a  uniform  mesh  would  impose  too  high  a  demand  on  the 
computation.  For  example,  approximating  the  boundary-layer  thickness  with  S  ~  for  Re  =  104  we  have 
S  ~  0.01  which  implies  at  least  102  grid  points  in  a  uniform  grid  which  cannot  be  considered  due  to  memory 
limitations  in  a  3-D  simulation.  Thus  for  this  model  problem,  no  regular  grids  will  be  considered.  The  grids 
are  stretched  in  the  ^-direction  using  a  geometric  factor  hk  —  (3hk~ i  with  (3  =  1.3  for  the  24x24x32  grid  and 
(3  —  1.1  for  the  48x48x64  grid. 


Fig.  4.6.  48x48x32  grid  used  for  the  flat-plate  simulation  (left-hand  chart).  Pressure  contour  lines  and  boundary  layer 
for  Re  =  104  (right-hand  chart). 

The  solution  is  verified  by  comparing  the  u- velocity  in  the  middle  of  the  plate  with  the  Blasius  analytical 
solution  for  a  2-D  plate  (figure  4.7).  The  little  discrepancy  near  the  layer  edge  is  due  to  the  highly  stretched 
grid  used  in  this  simulation.  The  second-order  accuracy  has  been  verified  using  the  solution  in  the  three 
finest  grids. 


Fig.  4.7.  Simulation  comparison  with  Blasius  theory  at  the  middle  of  the  plate  with  Re  —  104. 

The  multigrid  cycle  employed  to  solve  each  level  of  the  FMG  is  a  F(2,l)  cycle.  The  under-relaxation 
factors  used  in  the  simulations  are  shown  in  table  4.3.  Depending  on  the  problem,  some  plane  sweep 
directions  may  deteriorate  the  smoothing.  The  best  smoothing  rate  was  achieved  with  a  combination  of  xy- 
plane  relaxation  and  z  semi-coarsening  (ZSC).  We  do  not  include  results  of  the  alternating-plane  approach 
or  other  semi-coarsening  directions  because  of  their  poor  behavior. 
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Table  4.3 

Under-relaxation  factors  for  the  flat-plate  simulation  as  a  function  of  the  grid  size ,  Reynolds  number  and  multigrid  cycle. 


Grid 

Reynolds  Number 

24x24x32 

48x48x64 

L-ZSC 

Tri-ZSC 

L-ZSC 

Tri-ZSC 

102 

0.8 

0.8 

0.8 

0.8 

104 

0.6 

0.4 

0.6 

0.4 

Figure  4.8  shows  the  L2-norm  of  the  residual  versus  F(2,l)-cycles  with  lexicographic  and  tri-plane  xy- 
plane  implicit  smoothers  combined  with  Z  semi-coarsening  (ZSC)  for  several  grids  and  Reynolds  numbers. 
The  residual  norm  is  reduced  by  nearly  five  orders  of  magnitude  in  the  first  five  cycles  in  all  cases  (note 
that  the  reduction  is  of  four  orders  of  magnitude  in  the  first  two  cycles  for  the  48x48x64  grid).  In  fact,  the 
full  multigrid  algorithm  converges  the  solution  to  below  the  truncation  error  with  one  F(2,l)  cycle  per  level. 
The  asymptotic  convergence  rate  is  equal  to  0.19  which  is  close  to  that  obtained  for  the  Poisson  equation 
with  the  semi-coarsened  approach  [13]. 


Fig.  4.8.  L2-norm  of  the  residual  versus  F(2,l)-cycles  with  lexicographic  (top  graphs)  and  tri-plane  (bottom  graphs) 
xy-plane  implicit  smoothers  combined  with  z  semi- coarsening  (ZSC)  for  several  grids  and  Reynolds  numbers. 


Table  4.4  shows  the  average  convergence  factors  obtained  in  the  simulation  of  the  flat  plate  problem 
for  different  F(2,l)  cycles  and  several  stretched  grids  and  Reynolds  numbers.  Although  not  included  in  this 
report,  experiments  with  Reynolds  numbers  up  to  106  were  performed.  Convergence  rates,  independent  of 
the  Reynolds  number,  the  grid  size  and  the  stretching  factor,  were  achieved  for  the  resolution  of  the  boundary 
layer  over  a  flat  plate.  Since  the  flow  is  aligned  with  the  grid,  the  results  obtained  with  the  lexicographic 
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